perfect memory
The Yokai Learning Environment: Tracking Beliefs Over Space and Time
Ruhdorfer, Constantin, Bortoletto, Matteo, Bulling, Andreas
Developing collaborative AI hinges on Theory of Mind (ToM) - the ability to reason about the beliefs of others to build and maintain common ground. Existing ToM benchmarks, however, are restricted to passive observer settings or lack an assessment of how agents establish and maintain common ground over time. To address these gaps, we introduce the Yokai Learning Environment (YLE) - a multi-agent reinforcement learning (RL) environment based on the cooperative card game Yokai. In the YLE, agents take turns peeking at hidden cards and moving them to form clusters based on colour. Success requires tracking evolving beliefs, remembering past observations, using hints as grounded communication, and maintaining common ground with teammates. Our evaluation yields two key findings: First, current RL agents struggle to solve the YLE, even when given access to perfect memory. Second, while belief modelling improves performance, agents are still unable to effectively generalise to unseen partners or form accurate beliefs over longer games, exposing a reliance on brittle conventions rather than robust belief tracking. We use the YLE to investigate research questions in belief modelling, memory, partner generalisation, and scaling to higher-order ToM.
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.87)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
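The belief-tracking requirement the YLE abstract describes can be illustrated with a minimal sketch. This is our own toy reconstruction, not the authors' environment: an agent's belief about each face-down card is its last peeked colour (private) or unknown, and public moves carry that belief along with the card.

```python
from dataclasses import dataclass

UNKNOWN = None  # no colour observed yet for this position

@dataclass
class BeliefTracker:
    n_cards: int

    def __post_init__(self):
        # One belief slot per board position: last seen colour or UNKNOWN.
        self.beliefs = [UNKNOWN] * self.n_cards

    def observe_peek(self, card: int, colour: str) -> None:
        # Peeks are private: only the peeking agent updates its belief.
        self.beliefs[card] = colour

    def observe_move(self, src: int, dst: int) -> None:
        # Moves are public: the (possibly unknown) identity travels with
        # the card, and the source position becomes unknown/empty.
        self.beliefs[dst] = self.beliefs[src]
        self.beliefs[src] = UNKNOWN
```

Maintaining one such tracker per teammate (a belief about *their* beliefs) is where the higher-order Theory-of-Mind difficulty enters.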
Replay-enhanced Continual Reinforcement Learning
Zhang, Tiantian, Shen, Kevin Zehua, Lin, Zichuan, Yuan, Bo, Wang, Xueqian, Li, Xiu, Ye, Deheng
Replaying past experiences has proven highly effective at averting catastrophic forgetting in supervised continual learning. However, when replay is used against forgetting in continual reinforcement learning, some crucial factors are still largely ignored, leaving it vulnerable to serious failure even with perfect memory, where all data from previous tasks remain accessible in the current task. On the one hand, since most reinforcement learning algorithms are not invariant to the reward scale, previously well-learned tasks (with high rewards) may appear more salient to the current learning process than the current task (with small initial rewards), causing the agent to concentrate on those salient tasks at the expense of generality on the current task. On the other hand, offline learning on replayed tasks while learning a new task may induce a distributional shift between the dataset and the learned policy on old tasks, resulting in forgetting. In this paper, we introduce RECALL, a replay-enhanced method that greatly improves the plasticity of existing replay-based methods on new tasks while effectively avoiding the recurrence of catastrophic forgetting in continual reinforcement learning. RECALL leverages adaptive normalization of approximate targets and policy distillation on old tasks to enhance generality and stability, respectively. Extensive experiments on the Continual World benchmark show that RECALL performs significantly better than pure perfect-memory replay and achieves comparable or better overall performance than state-of-the-art continual learning methods.
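The two ingredients named in the abstract can be sketched as follows. This is an illustrative reconstruction under our own assumptions, not the authors' code: a running normaliser rescales value targets so high-reward old tasks do not dominate the loss, and a KL distillation term on replayed states penalises drifting away from the policy that previously solved an old task.

```python
import numpy as np

class TargetNormalizer:
    """Exponential-moving-average mean/std used to rescale value targets
    before they enter a shared loss (one instance per task, hypothetically)."""
    def __init__(self, momentum=0.99, eps=1e-8):
        self.mean, self.sq_mean = 0.0, 1.0
        self.momentum, self.eps = momentum, eps

    def update(self, targets):
        t = np.asarray(targets, dtype=float)
        m = self.momentum
        self.mean = m * self.mean + (1 - m) * t.mean()
        self.sq_mean = m * self.sq_mean + (1 - m) * (t ** 2).mean()

    def normalise(self, targets):
        std = np.sqrt(max(self.sq_mean - self.mean ** 2, 0.0)) + self.eps
        return (np.asarray(targets, dtype=float) - self.mean) / std

def distillation_loss(new_probs, old_probs, eps=1e-12):
    """KL(old || new) over action probabilities on replayed states:
    zero when the new policy matches the old one, positive otherwise."""
    new_p = np.clip(new_probs, eps, 1.0)
    old_p = np.clip(old_probs, eps, 1.0)
    return float(np.sum(old_p * (np.log(old_p) - np.log(new_p))))
```

Normalising targets addresses the saliency imbalance between tasks; the distillation term addresses the distributional-shift source of forgetting.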
Autonomous Visual Navigation: A Biologically Inspired Approach
Athanasoulias, Sotirios, Philippides, Andy
Inspired by the navigational behaviour observed in the animal kingdom, and especially that of ants, we attempt to simulate it in an artificial environment by implementing different kinds of biomimetic algorithms. Ants navigate using retinotopic views, moving so as to perceive the world in a way that more closely matches what they have memorized. Using this concept, we implement one robust method, "Perfect Memory", which uses the snapshot model. Perfect Memory rests on the unrealistic assumption that the agent remembers every single snapshot experienced along a training route. After evaluating the performance of this technique and confirming its robustness, we approach the same problem using Artificial Neural Networks (ANNs) as classifiers. This approach has the advantage of providing a holistic representation of the route, so the agent does not need to memorize every single snapshot. The basic idea is to train an ANN to classify whether a view is part of the route or not, using the snapshots as training data. We aim to explore and compare the performance of different ANN classification techniques, using Perfect Memory as the baseline.
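The snapshot-matching idea above can be sketched in a few lines. This is a minimal sketch under our own assumptions (function names, array shapes, and the 1-D panoramic encoding are ours): the agent rotates its current view and, Perfect-Memory style, compares every rotation against every stored snapshot, steering toward the rotation with the smallest image difference.

```python
import numpy as np

def best_heading(view, snapshots, n_rot=36):
    """Rotational image difference: return (rotation index, score) for the
    roll of `view` minimising the sum-of-squared-differences to the
    closest stored snapshot. `view` and each snapshot are 1-D panoramas."""
    width = view.shape[0]
    best = (0, np.inf)
    for r in range(n_rot):
        shift = r * width // n_rot
        rotated = np.roll(view, shift)
        # Perfect Memory: compare against every snapshot on the route.
        score = min(float(np.sum((rotated - s) ** 2)) for s in snapshots)
        if score < best[1]:
            best = (r, score)
    return best
```

The ANN variant replaces the inner `min` over all snapshots with a single learned in-route/off-route classifier, trading memory for a holistic route representation.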
Semantic Technology Trends in 2022 - DATAVERSITY
Semantic technology trends are expanding well beyond an interesting, more advanced search engine. Besides providing scientists with a more functional search engine, semantic technology is now being used to improve artificial intelligence and machine learning. Semantic technology uses a variety of tools and methods designed to add "meaning" to a computer's understanding of data. When asked a question, rather than simply searching for keywords, semantic technologies will explore a wide variety of resources for topics, concepts, and relationships. In the financial and science industries, companies have begun to semantically "enrich" content, processing complex data from a variety of sources.
An Algorithm for Learning Smaller Representations of Models With Scarce Data
We present a greedy algorithm for solving binary classification problems in situations where the dataset is either too small or not fully representative of the problem being solved, and obtaining more data is not possible. This algorithm is of particular interest when training small models that have trouble generalizing. It relies on a trained model with loose accuracy constraints, an iterative hyperparameter pruning procedure, and a function used to generate new data. Analysis of correctness and runtime complexity under ideal conditions, and an extension to deep neural networks, is provided. In the former case we obtain an asymptotic bound of $O\left(|\Theta|^2\left(\log|\Theta| + |\theta|^2 + T_f\left(|D|\right)\right) + \bar{S}|\Theta||E|\right)$, where $|\Theta|$ is the cardinality of the set of hyperparameters $\theta$ to be searched; $|E|$ and $|D|$ are the sizes of the evaluation and training datasets, respectively; $\bar{S}$ and $\bar{f}$ are the inference times for the trained model and the candidate model; and $T_f(|D|)$ is a polynomial in $|D|$ and $\bar{f}$. Under these conditions, this algorithm returns a solution that is $1 \leq r \leq 2(1 - 2^{-|\Theta|})$ times better than simply enumerating and training with any $\theta \in \Theta$. As part of our analysis of the generating function we also prove that, under certain assumptions, if an open cover of $D$ has the same homology as the manifold where the support of the underlying probability distribution lies, then $D$ is learnable, and vice versa.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
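The greedy loop the abstract outlines can be sketched as follows. All names here are hypothetical, not the paper's: a loosely accurate reference model gates the pruning, each round drops the worse-scoring half of the hyperparameter set $\Theta$, and a generating function augments the scarce dataset between rounds.

```python
def greedy_prune(theta_set, train, evaluate, reference_acc, generate,
                 data, eval_data, rounds=3):
    """Iterative hyperparameter pruning: train one candidate per setting,
    keep the better half (subject to the reference model's loose accuracy
    floor), augment the data, and repeat until one setting survives."""
    survivors = list(theta_set)
    for _ in range(rounds):
        if len(survivors) <= 1:
            break
        scored = []
        for theta in survivors:
            model = train(theta, data)
            scored.append((evaluate(model, eval_data), theta))
        scored.sort(reverse=True)  # best accuracy first
        # Greedy step: keep the top half that clears the reference
        # accuracy; always retain at least the single best candidate.
        keep = [t for acc, t in scored[: max(1, len(scored) // 2)]
                if acc >= reference_acc] or [scored[0][1]]
        survivors = keep
        data = data + generate(data)  # synthetic augmentation between rounds
    return survivors[0]
```

With a toy setup where a candidate's accuracy is just its hyperparameter value scaled down, the loop converges to the best setting after a few halvings.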
Optimal Continual Learning has Perfect Memory and is NP-hard
Knoblauch, Jeremias, Husain, Hisham, Diethe, Tom
Continual Learning (CL) algorithms incrementally learn a predictor or representation across multiple sequentially observed tasks. Designing CL algorithms that perform reliably and avoid so-called catastrophic forgetting has proven a persistent challenge. The current paper develops a theoretical approach that explains why. In particular, we derive the computational properties which CL algorithms would have to possess in order to avoid catastrophic forgetting. Our main finding is that such optimal CL algorithms generally solve an NP-hard problem and will require perfect memory to do so. The findings are of theoretical interest, but also explain the excellent performance of CL algorithms using experience replay, episodic memory and core sets relative to regularization-based approaches.
- Europe > Austria > Vienna (0.14)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.63)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
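The perfect-memory requirement in the result above can be contrasted with what practical episodic-memory methods actually do. A toy sketch under our own naming: the optimal learner keeps every example, while a bounded memory keeps a uniform subsample of the stream via reservoir sampling, one common approximation.

```python
import random

class PerfectMemory:
    """Stores every example seen: what the NP-hardness result says an
    optimal continual learner would in general need."""
    def __init__(self):
        self.data = []
    def add(self, x):
        self.data.append(x)

class ReservoirMemory:
    """Capacity-bounded episodic memory: after n additions, each seen
    example is retained with equal probability capacity/n."""
    def __init__(self, capacity, seed=0):
        self.capacity, self.data, self.seen = capacity, [], 0
        self.rng = random.Random(seed)
    def add(self, x):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(x)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = x
```

The gap between the two is exactly what the theory explains: replay-based methods approach perfect memory, which is why they outperform regularization-based approaches.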